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Abstract 

In this paper, we assess the performance of four iterative algorithms for solving non-symmetric 
rank-deficient linear systems arising in the FFT-based homogenization of heterogeneous mate¬ 
rials defined by digital images. Our framework is based on the Fourier-Galerkin method with 
exact and approximate integrations that has recently been shown to generalize the Lippmann- 
Schwinger setting of the original work by Moulinec and Suquet from 1994. It follows from this 
variational format that the ensuing system of linear equations can be solved by general-purpose 
iterative algorithms for symmetric positive-definite systems, such as the Richardson, the Con¬ 
jugate gradient, and the Chebyshev algorithms, that are compared here to the Eyre-Milton 
scheme—the most efficient specialized method currently available. Our numerical experiments, 
carried out for two-dimensional elliptic problems, reveal that the Conjugate gradient algorithm is 
the most efficient option, while the Eyre-Milton method performs comparably to the Chebyshev 
semi-iteration. The Richardson algorithm, equivalent to the still widely used original Moulinec- 
Suquet solver, exhibits the slowest convergence. Besides this, we hope that our study highlights 
the potential of the well-established techniques of numerical linear algebra to further increase 
the efficiency of EET-based homogenization methods. 

Keywords: Numerical homogenization, Eourier-Calerkin method. East Eourier Transform, 
guaranteed bounds, Richardson iteration. Conjugate gradient algorithm, Chebyshev 
semi-iterative method, Eyre-Milton scheme 


1. Introduction 

Various experimental and simulation techniques, such as serial sectioning [1], computed 
tomography [2], statistical reconstruction [3], or digital models [4] are currently available to 
characterize microstructures of heterogeneous materials in a degree of realism not possible before. 
When combined with the tools of homogenization theories, e.g. [5, 6, 7], these advances have 
made it possible to establish the structure-property relations of complex engineering materials 
across length scales ranging from micrometers to tens of centimeters. The scale transitions rely 
on the solution of the corrector problem - a boundary value problem defined on a representative 
cell of the material, typically involving periodic boundary conditions. Since the input data are 
provided in the form of pixel- or voxel-based geometries, the need therefore arises for efficient 
solvers that employ images as discretization grids. Although several finite element or finite 
difference solvers have been developed for this purpose (e.g. [8, 9, 10]) methods based on the East 
Eourier Transform (EET) generally offer the best computational efficiency, because of the regular 
grid, the simple shape of the computational domain, and the periodic boundary conditions. 
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In the field of computational micromechanics of materials, the first FFT-based homogeniza¬ 
tion solver was proposed by Moulinec and Suquet in 1994 [11] and more than twenty years later, 
it is still widely used because of its simplicity and computational speed. The crux of the method 
is to reformulate the corrector problem as an integral equation of the Lippmann-Schwinger type 
solved by fixed-point iterations, while taking advantage of the fact that the kernel action can be 
efficiently handled using FFT. Later extensions of the basic algorithm were driven by the need 
to (i) accelerate its convergence for high-contrast problems [12, 13, 14, 15, 16]; (ii) to increase 
accuracy of local fields by incorporating inclusion shapes [17], modified kernels [18, 19], or local 
smoothing of coefficients [20, 21]; and to (iii) prove the convergence of approximate solutions in 
the framework of spectral collocation methods [22, 23, 24], the Galerkin discretization of the non- 
classical Hashin-Shtrikman functionals with piecewise-constant approximation spaces [25, 26], 
and standard Fourier-Galerkin methods [23]. 

Apart from providing theoretical justification to the original scheme, the Fourier-Galerkin 
setting has also been found convenient from the numerical point of view. For instance, it 
has clarified the effects of numerical quadrature [27], and led to the development of fully ex¬ 
plicit guaranteed error bounds on homogenized properties based on a primal-dual variational 
approach [23, 28], which were later shown to be more restrictive than the corresponding Hashin- 
Shtrikman bounds [29] . The purpose of this paper is to complement these studies by examining 
the performance of four low-memory iterative methods for solving linear systems associated with 
the Fourier-Galerkin discretizations. Our comparison involves general-purpose short-recurrence 
solvers, namely the Richardson scheme [30], the Conjugate gradient method [31], the Chebyshev 
semi-iteration [32], together with the Eyre-Milton algorithm [12] - the most efficient of the ac¬ 
celerated schemes developed specifically for FFT-based homogenization problems, according to 
the recent study [33]. 

Related work. Previous comparative studies on FFT-based homogenization algorithms fall into 
two categories. The aim of the first group of works is to compare their results with finite 
element solvers for material-specific applications, such as particle-reinforced composites with 
elasto-plastic phases [34], visco-plastic models of polycrystalline materials [35, 36, 37], or trans¬ 
port processes and creep in concrete-like materials [38, 39]. Results of these studies consistently 
reveal that FFT-based methods offer at least an order-of-magnitude improvement in the com¬ 
putational time while predicting very similar distributions of local fields. The second group 
of studies is dedicated to accelerated schemes, namely to benchmarking their computational 
performance for high-contrast problems [40] and to revealing that they can be derived from a 
common recurrence relation [33]. 

Contributions. Although considerable effort has been spent on benchmarking FFT-based al¬ 
gorithms, neither of the studies above addresses conventional iterative solvers for symmetric 
positive-definite systems, the applicability of which follows naturally from the Fourier-Galerkin 
setting [41, 23]. We aim to fill this gap while utilizing the standard techniques and results of 
numerical linear algebra. In particular, we discuss in detail the (i) eigenvalue distribution of 
the system matrix, (ii) effects of numerical integration, and reduction in (iii) algebraic errors 
and (iv) guaranteed bounds on homogenized properties during iterations. To the best of our 
knowledge, this is the first study addressing such aspects for FFT-based homogenization solvers. 

Limitations. Because our goal is to provide basic insight into the behavior of the different lin¬ 
ear solvers for FFT-based homogenization, we restrict our attention to the two-dimensional 
scalar linear elliptic problems with isotropic phases, moderate contrasts in coefficients, and 
discretizations not exceeding ss 3,000,000 unknowns (corresponding to a 1,999 x 1,999 pixel 
image). We also do not provide details about the overall computational time, since all sim¬ 
ulations were performed with an experimental Python-based code FFTHomPy, available at 
https://github.com/vondrejc/FFTHomPy, that is not optimized for speed. However, because 
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our observations are based on well-established results of numerical linear algebra, they extend 
directly to more involved applications of FFT-based homogenization solvers reported in the 
literature, as evidenced by recent contributions [42, 43, 44]. 

Organization of the paper. The remainder of the manuscript is organized as follows. The essen¬ 
tials of the Fourier-Galerkin discretization of the periodic corrector problem are briefly reviewed 
in Section 2 following our more detailed expositions [23, 28, 27]. In Section 3, we provide details 
for the linear iterative solvers considered in this study. Results of the numerical experiments 
are gathered in Section 4, and the paper is concluded with the summary of the most important 
findings in Section 5. 


Notation. We will denote d-dimensional vectors and matrices by boldface letters, e.g. a = 
{o-a)a=i or A = {Aai3)a,i3=i,...,d £ The Euclidean inner product will be referred 

to as and the corresponding norm as || • |||Rd. By we will refer to the space of 

symmetric positive-definite d x d matrices. 

Vectors and matrices arising from discretization on regular grids will be denoted by the bold 
serif font in order to highlight their special structures. In particular, for a parameter AI G IKI'^ 
related to the discretization along each coordinate and an index set enumerating the degrees 
of freedom, see ahead to (7) for the exact specification, we use 


Sat = a, 


fceZ' 






A]\f = (A 


a=l,...,d 


k,meZi 


J a,l3=l,...,d 


£ [R 


dxN-\2 


The corresponding matrix-vector and matrix-matrix multiplication are understood as 


mez^ 


(AjvBjv)''”" = Y iov 


and the space is endowed with the following inner product and norm 

^dxN F/VT ^ ^ II II [R<^x W [j^dxiv ; 




where |Ar| = na=i ^a- The same nomenclature is used for complex-valued quantities. 


2. Background 

The periodic corrector problem amounts to finding the matrix Ah G Rfpd^ defined implicitly 
by the variational statement, e.g. [6, Chapter 13], 

(AhT/, E'j = min a(F/ g, E g) = a(^E , E , (1) 


where E is an arbitrary vector in R*^. The bilinear form 



( 2 ) 


is defined on the space of the square-integrable R'^-valued periodic functions on the unit cell y = 
na=i(“^) ^), denoted as R'^), and involves the matrix-valued coefficients A : T —>• Rgpd^ 

that satisfy 

c^lli’ll^d < (A(®)t;, for all r? G R^ and almost all x £ y, (3) 

with 0 < CA < Ca < 4-00. 
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The minimization problem (1) is constrained to a subspace of L‘^{y; R'^) defined by 


S’ = {v e L\{y; : curl v = 0, v{x) dx = 0}, 

Jy 

which reflects the fact that admissible vectors can be expressed as the gradient of a T-periodic 
potential. This constraint can be conveniently enforced by the orthogonal projection operator 
Q : L‘^{y’, t S given by, e.g. [6, Section 12.1] or [23, Lemma 2], 


t{k)v{k)(pk, 

fceZ'* 


0 for fc = 0, 

T{k) = { fcfcT ^ ^ Z'^\{0}, 


( 4 ) 


where T{k) G |paxa a,re projection matrices in the Fourier space and v{k) G C“ is the fc-th 
Fourier coefficient of v, 

v{k)= f v{x)ip_k{x)dxe& for (^fc(a:) = exp(^27ri(fc,a;)^rf) with a; G T. (5) 
jy 

It follows from the Lax-Milgram lemma that (1) has the unique minimizer for any 
G R^ that satisfies the optimality conditions 

a(e^^\v') = —a[E,v) for all v £ S. 


In the following sections, we will explain how to obtain computable approximation to 
using Fourier-Galerkin methods. This procedure includes specification of the approximating 
functions. Section 2.1; Galerkin discretizations with approximate and exact integrations yielding 
the guaranteed upper bounds on the homogenized matrix Ah, Section 2.2; and the specification 
of linear systems resulting from the discretization procedure. Section 2.3. 


2.1. Trigonometric polynomials 

Given the order of the polynomial approximation 

AT G IM'^ such that Na is odd for all a, 

the space of R'^-valued trigonometric polynomials admits two equivalent definitions, e.g. [45, 
Chapter 8], 

= ■.v'^=^e&] = { Y, G R'"}. (6) 

fcez^ mez^ 

These definitions involve the set of truncated frequencies 

Z^ = |fcGZ'':|fc„|<^|, (7) 

the Fourier basis functions (pk from (5), and the fundamental trigonometric polynomials 

TN,m{x) = 7^ Y ^'^’^Tk{x) for ® G T, 
kez% 

where the complex-valued coefficients 

= exp ^27ri Y^ for m,k e J.% 
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define the discrete Fourier transform (DFT). 

Notice that the components and in (6) associated with a trigonometric polynomial 
un £ independent since they satisfy 

v^ = UN{k), v'^ = un{x'^), for x'^ = andk,meZ%, 

where denotes the regular grid in the real space IR'^ associated with the polynomial 

order N. Thus, the matrices vjv = and vjv = can be mapped on each 

other with the help of DFT 

vjv = Fjvvjv G \iN = 


where the matrices implementing the forward and inverse DFT are given by 


1 


-N = r^ASa^^^N )a,/3=i,. ,d ^ F J, Fjv = (^a/3Wjv e [C J; (8) 


-dxAriS 




^dxNl'i. 


6 ai 3 stands for the Kronecker delta equal to one for a = /3, and to zero otherwise. 

In the following sections, we shall make repeated use of the discretization operator Xjv : 

C#{y;RA 


In[v] = {v{x%)) 


k 


that assigns the values of a continuous periodic function v at the regular grid to 

the vector from The operator Xjv establishes a scalar product-preserving one-to-one 

map between and R'^^-^, with the inverse —>■ and can thus be used with 

advantage to evaluate the action of the bilinear form (2) on trigonometric polynomials. 


2.2. Galerkin approximations and bounds 

The conforming finite-dimensional space on which the Galerkin approximations will be per¬ 
formed consists of curl-free trigonometric polynomials with zero mean 

= Q[-^n] = <^GSrAj. 

The homogenized matrix Ah,at £ associated with the Galerkin approximation (Ga) to 

the corrector problem (1), then satisfies 

E, E'j = min a(£^-|- cat, + ejv) = o(T'+ (9) 

eArSejv 

for arbitrary E G R'^. The most straightforward approach to the exact integration in (9) utilizes 
the Plancherel theorem in the Fourier domain and leads to dense matrix representations, e.g. [46] 
or [28, Section 6]. However, sparsity is recovered when the integration is transferred to a double 
grid, as shown in [28, Section 6]: 

a[u]sf,VN) = (A2Ar-iUAr,2Ar-i, VAr,2Ar-i)|Rdx(2Jv-i), (10) 

with UAr, 2 Ar-i = E 2 N-i[un], VAr, 2 JV-i = E 2 N-i[vn], and the block-diagonal matrix A 2 Ar-i 
provided by 

TS-Dw, ( 11 ) 

where A{n) G denotes the n-th Fourier coefficient of A, recall (5). For pixel- or voxel-wise 
constant coefficients, A 2 JV -1 can be assembled efficiently by FFT, see [27, Section 4] for details. 
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Because the exact integration is relatively involved, we also introduce a simpler strategy based 
on the Galerkin approximation with numerical integration (GaNi). It employs the trapezoidal 
rule leading to discretization-dependent bilinear forms ajv : ^ 


a[uN,V]Sf) ~ aN{uN,VN) = ^ A{x%)un{x%)vn{x%) = (AjvUATjViv) 


[pdx AT 1 


where ujv = ^n[un], vjv = and the block-diagonal matrix Ajv = 

collects the coefficients at the grid points. In analogy to (1) and (9), the corresponding corrector 
problem amounts to finding the matrix Ah,at £ l^spcf defined by 


{Au,nE, = 


min clnIE + Bn, E + Bn) 
ejvS^jv 


ajv {E -|- ^ 



( 12 ) 


Furthermore, combining the two conforming Galerkin schemes (9) and (12) with the varia¬ 
tional principle ( 1 ), one immediately obtains the guaranteed upper bounds on the homogenized 
matrix 

{AuE,E)^,<{An,NE,E)^,<{A^°;;^‘'E,E)^, = a{E + B^^\E + B^^^) 'dEGR'^, (13) 

where the matrix Ah jv ^ ^spcf follows from the action of the bilinear form a from (2) to 

the GaNi minimizers bt^ evaluated with the exact integration formula (10), cf. [28, Section 6 ] 
and [27, Section 4], Note that analogous arguments establish guaranteed lower bounds on Ah 
according to the dual variational principle, finally leading to fully explicit discretization error 
bounds on the approximate solutions. In the present paper, however, we shall work with the 
upper bounds (13) only; an interested reader is referred to [28, 27] for full details. 


2.3. Linear systems 

The fully discrete versions of the optimality conditions for Ga (9) and GaNi (12) follow from 
suitable applications of the discretization operators, resulting in 

(A 2 JV—i 6 ^, 2 Ar —1 >i) |]^(ix( 2 iv-i) = ~ (AAr, 2 iV— 1 E 2 JV— 1 ) V 2 JV— 1 ) |pdx{ 2 jv-i)) (14a) 

(Aive^^ Vjv)|^dxAr = —(AatEjv, Vjv) pdXiV 1 (14b) 

for all vjv, 2 JV-i £ Ejv, 2 JV-i and vjv £ Rn, with 

^N,2N-l = E2 N~i[^n] C = In[£‘n] C 

®iV,2Ar-l = '^2N-i[^n'’] £ ^N,2N-1, = In[b^^'^] G Ejy, 

E 2 JV -1 = E2N-i[E] G R'^x(2JV-i), Ejv = In[E] G 


SO, for example, EAr, 2 Ar-i contains the nodal values of trigonometric polynomials from S’n at 
the double grid points 

To obtain the systems of linear equations defined by the optimality conditions (14), we need 
to enforce the constraints vjv, 2 JV-i £ EAf, 2 Ar-i and vn £ Ejv by means of suitable projections. 
Thanks to the properties of trigonometric polynomials. Section 2.1, such discrete orthogonal 
projections follow directly from the continuous version (4): 


’N,M 




with ( G 


’AT.M 


km 


0 for fc, m G Z^\Z^ 

fork,meZ%, 


(15) 


6 


so that, e.g. Gjv, 2 JV-i ^ |R'^x(2-^v~i) Ejv, 2 Ar-i; we abbreviate Gjv,iv to Gjy in what follows. 
Now, the linear systems corresponding to Ga and GaNi arise as, cf. [23, Proposition 12] and [27, 
Gorollary 28], 

Gjv, 2 Ar-iA 2 jv.^iE 2 Ar-i, (16a) 

GjvAatEat. (16b) 

Once the solutions to these linear systems have been obtained, the guaranteed upper bounds 
from (13) can be made explicit: 

= (A2iV-l(E2iV—1 + 6^^2Ar-l)j ^2JV—1 + 6^^2Ar-l)|R'ix(2JV-i)) (t'^n) 

{^H,N E,E)^a = (A2Ar-i77Ar,2Ar-i[EAr + e^^],77jv,2Ar-i[EAr + e^^])ij^rfx(2Jv-i)) (17b) 

where the prolongation operator 'R-n,2N-i = E 2 N -1 ° maps to R'^x(2.^v-i) through 

the intermediate space of trigonometric polynomials 


A 

>Ar,2JV-i^2Ar-iejY^2JV-i ~ “ 
GjvAatg^^ = — 


3. Solvers 

In order to avoid a profusion of notation, we shall refer to the linear systems (16) in a unified 
way as 

Cx = b for X e E, (18) 

so that, e.g., for the GaNi variant, C = GjyAjv, x = b = —GjyAjvEjv, and E = E^. We 
shall also abbreviate {•,»)^dxN or {•,»)^dx( 2 N^i) to IhllRdxw or ||•||j^dx( 2 jv-i) to ||•|| 2 , 

and E;v or E 2 Ar-i to E when there is no risk of confusion. 

Notice that in general, matrix C in (18) is non-symmetric and highly rank-deficient. However, 
as first demonstrated by Vondfejc et al. [41], C acts as a symmetric positive-definite matrix for 
vectors from the subspace E and satisfies 

ca||x ||2 < (Cx, x)^ < C'a||x ||2 for X G E, 

where ca and Ca are the bounds on the material coefficients A from (3); the condition number 
of C on E can be estimated from above independently of diseretization by k = CaIca (we invite 
an interested reader to refer to Section 4.1, where these properties are demonstrated on concrete 
examples). 

Because multiplication by C can be performed efficiently using the FFT in ©(jA/"] logjATj) 
operations, recall the definition of projection operator in (15) and (8), the problem (18) turns 
out to be well-suited to conventional iterative algorithms for symmetric positive definite systems, 
provided that all iterates generated by the algorithm, x^, remain in E. 

The remainder of this section is devoted to four such iterative solvers, covering general- 
purpose short-recurrence iterative algorithms, Sections 3.1-3.3; and a special-purpose solver, 
Section 3.4. In order to keep our exposition compact but self-contained, for each algorithm 
we present its pseudo-code; discuss the complexity of a single iteration, error estimates, and 
conformity of the iterates x^ G E; and briefly comment on their applications to Ga- and GaNi- 
based formulations in earlier studies. 

3.1. Richardson iteration 

The Richardson iteration [30] belongs to the group of stationary iterative methods. It 
searches for a fixed point of the mapping 

Xm = Xm_i — a;(Cxm-i — b) for m G IKI and Xq G E (19) 

by means of Algorithm 1. 
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Algorithm 1 (Richardson iteration). Input: C, b, xq = 0, ca, Ca, and e 

OJ <— ca+Ca choice) 

For m = 1, 2, • • • 

^m—1 OJ (Cx^_i b) 
until ||xm - Xm-ilb < e||E ||2 

return x^ 


A current implementation, one iteration of the Richardson solver involves three vectors, 
Xrri) Xm-i, and b, and one FFT-based matrix-vector multiplication Cx^-i that dominates its 
computational cost. The convergence analysis of the Richardson scheme is available, e.g., in [47] 
or [48, Section 4.2], where it is shown that the optimal choice of the iteration parameter 

2 

^ 

CA + Ca 

provides the error bound for the m-th iteration in the form 

Xm - x||2 < ( j ||xo - xjj^ for m G (Mq. 




To see that all iterates are conforming to E, we rewrite the iterations from Algorithm 1 as 


Xm = Xm-i — ojrm-i for m G IM and xq G E, 

where r^-i = Cx^-i — b is the (m — l)-th residual vector which is located in E, because both the 
matrix C and vector b involve the discrete projection operator; recall the linear systems (16). 
Hence, the approximate solutions are conforming, x^ G E, provided that the initial guess satisfies 
Xq G E. 

Rather interestingly, the Richardson iteration applied to the GaNi system (16b) yields the 
original variant of the Moulinec-Suquet scheme [11]; the optimal choice of the iteration param¬ 
eter (20) then corresponds to the results of convergence studies reported in, e.g. [12, 14, 15].^ 
Applicability for Ga-based discretization has been recently reported by Monchiet [29]. 


3.2. Conjugate gradient method 

The conjugate gradient method [31] constructs the iterates by projecting the system (16) on 
to a sequence of Krylov subspaces generated by the initial residual vector, 

Kjn = span {ro, Cro, C^ro, • • • , C™-“^ro} for m < dim(E), (22) 

utilizing a coupled two-term recurrence defined by Algorithm 2. 

Algorithm 2 (Gonjugate gradients). Input: C, b, xq, and e 
ro <— b - Cxo 

Po ^— •'0 

For m = 1, 2, • • • 

Q^m-l (fm-l) •'m-l)2/(^Pm-l) Pm-l)2 
Xm ^ Xm—1 T Om—iPm—1 

•"m ^ f’m—1 Q^m—iCPm—1 

/3m—1 ^ (f’mTm) 2 /1 Tm—l) 2 
Pm — Pm-lPm—l 
until ||rm ||2 < c||b ||2 

return Xm 


^Indeed, in the current notation, iterations of the Monlinec-Suquet algorithm are defined by the recurrence 
Xm = Xm-i — ToA/E + Xm-i), where To = G/cq stands for the (matrix) Green operator of the so-called reference 
problem with coefficient cqI. For the optimal choice cq = (ca -t Ca)/2, the two algorithms coincide. 
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In terms of storage, the Conjugate gradient method must keep track of three vectors, namely 
the solution x^, the residual r^, and the search direction p^. In addition, the product Cp^_^ 
must be calculated with the help of the FFT and stored to an auxiliary vector in order to 
keep the number of matrix-vector multiplications limited to one. Finally, notice that two scalar 
products are needed per a single iteration. 

As shown in, e.g. [49, 50], the error in the m-th iteration is bounded by 


X - Xr 


lc<2 


K — 1 


K + 1 


X - Xo 


c 


(23) 


Since each Krylov subspace from (22) satisfies C E, the iterates x^ are conforming, as hrst 
observed in [41]. 

To the best of our knowledge, the first heuristic applications of the conjugate gradient method 
to systems associated with GaNi discretization was reported independently by Brisard and 
Dormieux [51] and Zeman et al. [22], and justified later by Brisard and Dormieux [25] and 
Vondfejc et al. [41, 23]. Performance of the solver for the Ga system has been recently studied 
by Vondfejc [27], but no comparison with other iterative solvers has been made to date. 


3.3. Chehyshev semi-iteration 

The Chebyshev semi-iteration [32, 52] builds upon a generalization of the Richardson iterative 
formula (19) 

— ^m —1 + LUmrm-i for m G IM and xq G E, (24) 


where uJm is related to the roots of the Chebyshev polynomials of the first kind, shifted to 
the interval [ca,Ca\ and suitably normalized. In our numerical experiments, Um is determined 
indirectly from a composite two-term recurrence according to Algorithm 3, because this relation 
proved to be the most numerically stable from the variants available in [53]. 


Algorithm 3 (Chebyshev iteration). 


ro 

C i 

d< 
a • 

/?• 

I'D 

Po 


- b — Cxq 

UCa-ca) 
^(ca + Ca) 


2 \d) 

b — Cxo 
- •'0 

For m = 1, 2, 
if m > 1 


a 


/3 


-1 


r, 

Xi 

P 

until IIr 
return x 


— (^d 

_ _/ ca\ ‘^ 

\ 2 ) 

I'm—1 ~ ^^Pm—1 
1 ^Pm—1 

fm - /3Pm-l 

bjb 


m|l2 — ^ 


Input: C, b, xq, ca, Ca, and e 


Similarly as in the case of the Conjugate gradient algorithm, the Chebyshev semi-iteration 
updates the triplet (x^, r-m, Pm) every iteration and requires only one matrix-vector product. 

As discussed, e.g., in [54, Section 2], the Conjugate gradient estimate (23) holds also for 
the Chebyshev method and actually better reflects its true convergence behavior. By (24), 
all approximations Xm are conhned by E. Note that, to the best of our knowledge, this work 
represents the first application of the Chebyshev method to FFT-based homogenization. 
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3.4-- Eyre-Milton accelerated scheme 

The basic idea of the Eyre-Milton algorithm [12] is to recast the system (16) into an equivalent 
form with better conditioning and to solve the modified system by the Richardson scheme for 

f pi\ 

the unknown in the form x = Ejv + resulting scheme appears in Algorithm 4. 

Algorithm 4 (Eyre-Milton accelerated scheme). Input: A, G, xq = E, ca, Ca, and e 
id <— \/caCA (Optimal choice) 

P ^ (A + wl)"^ - 2G]f) (A - wl) 

q = —2uj{A -|- a;l)“^xo 

For m = 1, 2, • • • 

Xm — PXm—1 -|- q 
until jjxm - Xm-ilb < e||E||2 

return x^ 


The storage requirements for the Eyre-Milton scheme are similar to those for the Richardson 
iteration and involve x^-i, and q. Multiplication with matrix P is more demanding than 
with C, cf. (16), but its cost is dominated again by the forward and inverse EFT, and thus has 
the same complexity of OdAlj log | Alj). 

The relative error bound for x^ [15] 

II II /^-1\”*|| II , , 

||Xm - x]|2 < I J 


is similar to the Conjugate gradient and the Chebyshev estimates (23). In addition, the iterates 
generated by this algorithm are not conforming in the sense that (x — E) 0 E, see [33]. 

According to a seminal study by Moulinec and da Silva [33] , the method outperforms other 
accelerated schemes available in the literature, namely the augmented Lagrangian formulation by 
Michel et al. [13, 14] and the polarization-based method by Monchiet and Bonnet [16] in the GaNi 
setting. However, the effect of numerical integration is very pronounced for this algorithm; 
quite surprisingly, we found that the method does not converge when matrices in Algorithm 4 
correspond to Ga discretization. We attribute this behavior to the fact that coefficient matrix 
for GaNi, Ajv, still satisfies the estimates for the continuous problem (3), 

CA||vAr||Kdxiv < (AivVjv,VAr)K<ixJv < C'A||vjv||RdxJv for all vjv G 


However, a similar condition no longer holds for A 2 iv~i; because the exact integration impacts 
its eigenvalue distribution; see Section 4.1 for an explicit example. 


4. Examples 

Two types of unit cells appear in the comparative study. These include a single square 
inclusion of 36% volume fraction. Figure 1(a), and a two-dimensional cross-section of an alkali- 
activated ash foam sample analyzed in [55], Figure 1(b). Unless specified otherwise, the coeffi¬ 
cients of the cell problem (3) are taken as 

1 in matrix, 

100 in inclusion, 

for the square inclusion and the fly ash foam cell, respectively. The default grid of the square 
inclusion is set to 85 x 85, since the same convergence behavior was observed for finer discretiza¬ 
tions, while the alkali-activated foam sample corresponds to a 1,999 x 1, 999 pixel bitmap. The 
macroscopic field is set to E = [1;0] and the tolerance of all four algorithms is e = 10“®. 



A = 


2.6 in matrix (black), 
0.026 in inclusion (white). 


(26) 
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Figure 1: (a) An example of square inclusion discretized with 5x5 points and (b) 1,999 x 1, 999 pixel bitmap of 
an alkali-activated fly-ash sample (courtesy of Petr Hlavacek, CTU in Prague). 


4-.1. Eigenvalue distribution 

In order to obtain the complete eigenvalue distribution, the unit cells from Figure 1 were 
discretized with trigonometric polynomials of order 15 x 15, leading to matrix sizes d\N\ x d\N\ = 
(2 • 15 • 15) X (2 • 15 • 15) = 450 x 450 for GaNi discretization and d\2N — 1| x d\2N — 1| = 
(2 • 29 • 29) X (2 • 29 • 29) = 1,682 x 1, 682 for Ga. 

The cumulative distribution of eigenvalues for the two unit cell types and the two discretiza¬ 
tions appear in Figure 2. For GaNi discretization, the matrix has a rank of 224, whereas the 
remaining 226 eigenvalues are zero. It further follows from the discussion in [27, Remark 30] 
that out of 226 zero eigenvalues, 2 correspond to the constant fields and 224 represent trigono¬ 
metric polynomials of zero divergence and zero mean. For Ga, the null-space is increased by 
1, 232 eigenvectors resulting from the double-grid projection (15); notice that the corresponding 
eigenvalues are not included in Figure 2 for better clarity. 

Figure 2 illustrates how cell geometry and numerical integration influence the matrix spec¬ 
tra. In particular, for the square inclusion and GaNi, Figure 2(a), the eigenvalues in spectrum 
[1; 100] form clusters. For instance no eigenvalues are present between 8 and 25 because the 
cumulative distribution is constant on this interval. Exact integration renders the spectrum less 
clustered, Figure 2(b), but the effect of geometrical irregularity is even stronger. Figure 2(c), so 
that the interval [0.026; 2.6] is sampled uniformly with the eigenvalues of the matrix resulting 
from full integration applied to the fly ash foam cell. Figure 2(d). Finally, observe that irre¬ 
spective of the discretization used, the non-zero eigenvalues are bounded by the coefficients of 
the phases according to (26); the matrix C is thus indeed symmetric positive-definite on E. The 
extreme eigenvalues have the highest multiplicity as indicated by the corresponding jumps in 
the eigenvalue distribution. 

Figure 3 demonstrates the effect of numerical integration on matrices of coefficients Ajv (of 
rank 450) and A 2 JV -1 (of rank 1,682) for GaNi and Ga schemes and the square inclusion. In 
this case, the effects of integration are even more pronounced. For GaNi scheme, the spectrum 
consists of two values {1,100}, because matrix Ajv contains only coefficients of the continuous 
problems (3) sampled on a regular grid. For Ga scheme, the exact integration (11) significantly 
changes the spectrum of A 2 JV- 1 ; the eigenvalues are now located within interval [—3.89,108.89]. 
We conjecture that divergence of the Eyre-Milton scheme for Ga, reported in Section 3.4, is a 
direct consequence of this fact. 
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Square inclusion 



Fly ash foam 



Figure 2: Cumulative eigenvalue distribution for system matrices resulting from GaNi (left) and Ga (right) 
discretizations; non-zero eigenvalues are binned into 100 intervals. 


4-2. Residual norm 

We start our study on the convergence properties of FFT-based solvers by investigating the 
evolution of the residual norm 


||rm||2 = ||b - Cxm,||2 for m G IMo 

during iterations, see Figure 4. Results reveal that the Richardson scheme and Conjugate 
gradients display behavior qualitatively different from the Chebyshev method and the Eyre- 
Milton method. For the first two algorithms, convergence proceeds in two stages: the first 
stage (~ 10 iterations) is associated with a rapid decrease in the residual norm, and then slows 
down in the second stage. These two stages are especially pronounced for the GaNi-discretized 
square inclusion. Figure 4(a), but they are clearly visible in all remaining examples. 

The Chebyshev and the Eyre-Milton schemes display almost the same behavior along the 
whole iteration process. We attribute this behavior to the fact that the first group of algorithms 
initially resolves the components of the residuum vectors associated with the largest eigenvalues 
and then proceeds through the rest of the spectrum down to the smallest eigenvalue, cf. [56]. The 
second group of algorithms, on the other hand, simultaneously reduces the residuum components 
associated with the full spectrum, as confirmed by our computational observations that will form 
the basis of a separate publication. 

Except for the Richardson scheme, all methods display oscillations in the convergence plots 
for regular geometry and GaNi, Eigure 4(a), that are significantly dampened by the exact in¬ 
tegration and/or irregular distribution of phases. These phenomena again closely follow the 
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Figure 3: Cumulative eigenvalue distribution for coefficient matrices resulting from GaNi (left) and Ga (right) 
discretization of square inclusion; eigenvalues are binned into 100 intervals. 


eigenvalue distributions displayed in Figure 2. We also observe that the behavior of the Cheby- 
shev method and Eyre-Milton scheme are almost identical for the simple geometry; for the fly 
ash foam microstructure, the Eyre-Milton method is more efficient and the residual norms gen¬ 
erated by the Chebyshev method increase in the first iteration. At later iterations, the rate 
of convergence of the Richardson scheme is inferior to the remaining algorithms, which exhibit 
super-linear convergence in accordance with error estimates (21), (23), and (25) reported earlier 
in Section 3. Nevertheless, the conjugate gradient method always displays the best performance. 

4-3. Guaranteed upper bound 

The numerical experiments in the previous section concentrated on the residual norm because 
it is used as the stopping criterion in conventional solvers. Here, we shall consider the evolution 
of the guaranteed upper bound at the m-th iteration, 

UE(^rri) — (^^2N—l{^2N—l T ^m), E 2 JV—1 T ^m) (2iV—1 ) ^Or 171 G INIq, (^'^) 

where the mapping Ue '■ '^n, 2 N-i IR provides the average energy in the unit cell with the 
distribution of local fields (El -|-X^_j^[xm]). If the iterates x^, G ^n, 2 N-i correspond to the Ga 
scheme, UE{^m) —^ recall (17a). Eor the GaNi scheme, the iterates x^ must 

be projected to Ejv, 2 Ar-i to obtain UE{T^N, 2 N-i[^N^rn]) —^ (^h,jv E,E)^^, recall (17b). 
Notice that for xq = 0, Eq. (27) corresponds to the Voigt estimate. 

The behavior of individual solvers, Eigure 5, agrees well with the observations made from the 
residual plots, Eigure 4; in particular the estimates on the guaranteed upper bound generated 
by the Richardson scheme and by the Conjugate gradient method converge faster than the 
Chebyshev and Eyre-Milton schemes in the first ~ 10 iteration, after which the bounds appear 
to stabilize. 

In order to investigate the quality of the upper bound in more detail, we introduce the error 
in the guaranteed upper bound at the m-th iteration 

\UE{y^m) - Ue{'x*)\ for m G iNJo 

where x* denotes the approximation to the solution of (18) obtained with the Conjugate gradient 
method and tolerance e = 10“^^. The results are collected in Eigure 6 and demonstrate that 
to achieve a target accuracy of 10“^, for instance, the Conjugate gradient algorithm needs 
less than 25 iterations for GaNi discretization and about 15 iterations for Ga, whereas the 
Chebyshev and Eyre-Milton methods require about 10 additional iterations to reach the same 
accuracy (except for fly ash foam microstructure and GaNi discretization, where the methods 
perform similarly). The iterates generated by the Richardson scheme deliver an accuracy of 
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Figure 4: Convergence history of residual norms for GaNi (left) and Ga (right) discretizations. 


10“^ only for Ga discretization and 70 iterations, Figure 6(b), or 60 iterations. Figure 6(d). The 
superior performance of the Conjugate gradient method in the Ga setting is not surprising; the 
upper bound (27) is exactly the energy norm that Conjugate gradients minimize over the Krylov 
subspaces (22), as has been recently pointed out by Vondfejc [27]. 

Being inspired by recent results in [27], we also find it instructive to demonstrate the effect 
of exact numerical integration on the tightness of the guaranteed upper bounds. Results in 
Figure 7 confirm that these effects are indeed significant. For square inclusion and GaNi, the 
bound equals to 2.793 and decreases by 20% to 2.241 by the exact integration; for the fly ash 
foam cell, GaNi provides a value of 0.513 that is improved to 0.469 by Ga. We believe that these 
results promote Ga discretization over GaNi, despite the increased computational costs due to 
involvement of a double grid. 

4.4- Non-conformity 

We shall quantify the non-conformity of iterates for GaNi-based systems by the quantities 

+ Ejv 11 jY ) ^rrtllj^dxJV ^ ^ ^0) 

defined for the Eyre-Milton scheme and the remaining algorithms, respectively, with an obvious 
generalization to the Ga setting. The reason for this distinction is best seen in Figure 8(a), 
which highlights the difference between the non-conforming Eyre-Milton scheme and conforming 
solvers - the Richardson, the Conjugate gradient, and the Chebyshev algorithms. Eor the 
Eyre-Milton scheme, the non-conformity error reaches its maximum in the first iteration and 
progressively decreases in a non-monotone way. For conforming algorithms, the error increases 
with an increasing number of iterations due to the accumulation of round-off errors. Notice that 
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Figure 5: Convergence history of guaranteed upper bounds for GaNi (left) and Ga (right) discretizations. 


although the round-off effects per iteration are smallest for the Richardson scheme, the total 
values are the same at convergence, Figure 8(b). 

As follows from the discrete Helmholtz decomposition of [28, Lemma 21], the non¬ 

conformity error of the Eyre-Milton scheme can be further split into two orthogonal components. 




+ E_/v ~ X. 


m II [pdxiv 


= ||Ejv - (x. 


mj II pdxiv 


+ II + (Xm)|| 


[pdXiV ? 


which quantify the difference between the mean value of the m-iterate, defined with (x]^) = 
{Ylik&T% ^ ^ prescribed vector Ejv, and the distance from zero-mean 

curl-free vectors E^. The relative distribution of both components. Figure 9, demonstrates that 
the latter component dominates the error in mean. 


4-5. Relative error bound 

In support of the theoretical results gathered in Section 3, our purpose is to illustrate the 
behavior of the relative error 

11 Xm ~ ^ 11 C r ^ I 

-n-rr-^ for m G EIq. 

||xo - X*||j, 

The results of this study appear in Figure 10 for the GaNi-based system and confirm the linear 
convergence of the Richardson scheme implied by error estimate (21) and the super-linear con¬ 
vergence of the remaining three algorithms, in agreement with relations (23) and (25). Notice 
that no results for Ga discretizations have been shown, since they are very similar to the error 
plots for the guaranteed upper bound. Figure 6(b,d), and lead to the same conclusions. 
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Figure 6: Convergence of error in guaranteed upper bound for GaNi (left) and Ga (right) discretizations. 


4-6. Condition number 

To conclude our numerical experiments, in Figure 11 we plot the dependence of the number 
of iterations to reach the accuracy of e = 10“® on the condition number of system matrix k. 
Note that we present the results only for the square inclusion, Figure 1(a), and set the coefficient 
of the inclusion equal to k (analogous conclusions hold for more complex microstructures). In 
addition, all algorithms are terminated by the relative residual norm. 

The results generally follow the trend predicted by the error estimates (21), (23), and (25) 
and confirm the linear scaling 0{k) for the Richardson scheme and for the remaining 

solvers. Closer inspection reveals that some solvers converge faster for large values of k, namely 
the Conjugate gradient method or the Richardson scheme for Ga discretization. Such supercon- 
vergent behavior was reported in earlier studies by, e.g., Monchiet and Bonnet [16] and Willot et 
al. [18] for FFT-based solvers, and Schneider et al. [57] for finite difference methods, in order 
to demonstrate that their methods work for k —)• oo. On the basis of the presented results, one 
may thus conjecture that iterative methods applied to Ga-based systems will converge for the 
infinite contrast of material coefficients. We would like to address this question in our future 
work, together with a more detailed investigation into the distribution of discretization errors. 

5. Conclusions 

In this paper, we have performed a comparative study of iterative algorithms for systems of 
linear equations arising from a Fourier-Galerkin discretization of the periodic corrector problem. 
Two discretization schemes have been considered, exact (Ga) and trapezoidal (GaNi) integra¬ 
tion, and the ensuing systems of linear equations have been solved with the Richardson, the 
Chebyshev, the Conjugate gradient, and the Eyre-Milton algorithms. 
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Figure 7: Effect of numerical integration on tightness of guaranteed upper bound for (a) square inclusion and 
(b) fly ash foam microstructures. 




Figure 8: Evolution of non-conformity norm during iterations for square inclusion and (a) GaNi and (b) Ga 
discretizations. 

Based on the outcomes of our study, summarized in Table 1 for the reader’s convenience, we 
conclude that: 

(i) In terms of the rate of convergence, the Conjugate gradient, the Chebyshev, and the Eyre- 
Milton algorithms exhibit super-linear convergence and the Richardson method converges 
with the linear rate. In addition, the Conjugate gradient method appears to be the most 
efficient solver, while the Chebyshev and the Eyre-Milton algorithms display comparable 
performance. 

(ii) All three general-purpose solvers — the Richardson, the Conjugate gradient, and the 
Chebyshev algorithms — generate iterates that conform to E, the space associated with 
curl-free and zero-mean trigonometric polynomials. The Eyre-Milton method produces 
non-conforming iterates. 

(iii) The general-purpose solvers work for linear systems arising from both Ga and GaNi dis¬ 
cretizations, while the Eyre-Milton method is applicable exclusively to the GaNi setting. 

(iv) The approximate upper bounds generated by the Richardson and the Gonjugate gradient 
methods for the Ga discretizations exhibit monotone convergence, whereas all other options 
yield non-monotone convergence. The superior performance of the Conjugate gradient 
algorithm follows from the fact that the bound corresponds to the energy norm that 
Conjugate gradients naturally minimize. 

(v) With regard to memory efficiency of the implementations introduced in Section 3, the 
Conjugate gradients need to store one additional vector per iteration. The computational 


17 


































Figure 9: Relative distribution of non-conformity norm in the Eyre-Milton algorithm for the square inclusion 
problem and GaNi discretization. 



Figure 10: Performance of error estimates for GaNi discretization and (a) square inclusion and (b) fly ash foam 
microstructures; error bound is dehned by (23). 


complexity of a single iteration of all algorithms is comparable because it is dominated by 
the forward and inverse FFTs. 

We believe that the computational observations collected in this paper provide a convenient 
starting point for the development of even more efficient solvers for FFT-based homogenization 
algorithms. Our particular interest is to continue to explore the potential of the Chebyshev 
method to achieve a more robust convergence, e.g. [58], or to serve as a preconditioner in 
multi-grid solvers [59]. Investigations of these topics are underway and results will be reported 
separately. 
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Figure 11: Dependence of the number of iterations to convergence on the condition number of the system matrix 
for square inclusion and (a) GaNi and (b) Ga discretizations. 


Algorithm 

Scaling 

Conforming 

Ga 

Upper bound 

Storage 

Richardson 

0{k^) 

Yes 

Yes 

Monotone 

3 

Conjngate gradients 

0{k2) 

Yes 

Yes 

Monotone 

4 

Chebyshev 

0{k-2) 

Yes 

Yes 

N on-monotone 

3 

Eyr e-Milton 

0{k^) 

No 

No 

N on-monotone 

3 


Table 1: Overall comparison of the four short-recurrence iterative algorithms; scaling indicates how the number 
of iterations to converge grows with an increasing condition number of system matrix k, conforming algorithms 
generate iterates from subspace E, Ga refers to the extendibility of the algorithm to the exact integration, upper 
bound indicates how the approximate upper bounds generated by the iterates converge to the guaranteed upper 
bound, storage specifies the number of vectors needed in one iteration. 
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